[feat] Allow loading custom modules; encode kwargs passthrough to modules #2773
Conversation
```python
# 1) Classes shipped with Sentence Transformers itself
if class_ref.startswith("sentence_transformers."):
    return import_from_string(class_ref)

# 2) Remote code, loaded via the transformers dynamic module machinery
if trust_remote_code:
    code_revision = model_kwargs.pop("code_revision", None) if model_kwargs else None
    try:
        return get_class_from_dynamic_module(
            class_ref,
            model_name_or_path,
            code_revision=code_revision,
        )
    except EnvironmentError:
        # Ignore the error if the file does not exist, and fall back to the default import
        pass

# 3) Fallback: import whatever the class reference points at
return import_from_string(class_ref)
```
@muellerzr I'd love to get your opinion on whether this might be problematic in terms of security. The first `if` imports something from the Sentence Transformers module, which should be fine unless a malicious package is installed that uses `sentence_transformers` as the import name.

The second `if` imports the class via the `transformers` remote code functionality. I assume this is safe.

The third case imports whatever the class reference is, so this can also be a class in a malicious module. But the user would already have to have that malicious package installed. I'm not too concerned about this case, as preventing it would also prevent third parties from creating non-malicious custom ST modules.

- Tom Aarsen
This PR looks good to me. I'll do some manual testing on this branch using different models to make sure everything works correctly.
Changed the title: [feat] Allow loading custom modules → [feat] Allow loading custom modules; encode kwargs passthrough to modules
Hello!
Pull Request overview

- Allow loading custom modules, e.g. via `trust_remote_code=True`
- Pass `encode` keyword arguments through to the individual modules

Details
This is an experimental branch to allow models with custom architectures to integrate with Sentence Transformers. Behind the scenes, Sentence Transformer models always rely on a list of modules, with configuration in `modules.json`. Usually, this contains a `Transformer` followed by a `Pooling` and optionally a `Normalize` and/or `Dense` module, but we can now fully change this up. I have two implementation formats that should both work:
1. The model repository itself contains the custom module code, e.g. in a `custom_st.py` file, and `modules.json` points to a class from that file which implements `forward`. The `forward` method accepts a dictionary of features from the tokenization, and then must output a dictionary. Some modules might output `sentence_embedding`, while others may output `token_embeddings`. The latter can then still be used with the `Pooling` module, which will convert the token embeddings into sentence embeddings. See the module sketch after this list.
2. The `modules.json` references a class in another repository, e.g. `tomaarsen/jina-clip-implementation-st--custom_st.Transformer`, and this repository hosts `custom_st.py`. Otherwise, everything is the same as the other implementation. See the `modules.json` sketch below the note.
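To make the first format concrete, here is a minimal sketch of such a custom module. The class name, constructor, and the linear transform are illustrative assumptions, not code from this PR; only the dict-in/dict-out `forward` contract is what the format requires.

```python
import torch
from torch import nn


class MyCustomModule(nn.Module):
    """Hypothetical custom module for a Sentence Transformers pipeline."""

    def __init__(self, embedding_dim: int = 768) -> None:
        super().__init__()
        self.linear = nn.Linear(embedding_dim, embedding_dim)

    def forward(self, features: dict[str, torch.Tensor]) -> dict[str, torch.Tensor]:
        # Read the per-token embeddings produced by an earlier module ...
        token_embeddings = features["token_embeddings"]
        # ... transform them, and write them back so that e.g. the standard
        # Pooling module can still turn them into a sentence embedding.
        features["token_embeddings"] = self.linear(token_embeddings)
        return features
```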
Note: you don't have to use `Transformer` or `custom_st` as names; you can be flexible here.
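For illustration, a `modules.json` for the first format might look like the sketch below; the field values are assumptions for this example, not taken from the PR. For the second format, the custom module's `type` would instead be the cross-repository reference, e.g. `tomaarsen/jina-clip-implementation-st--custom_st.Transformer`.

```json
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "custom_st.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  }
]
```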
Usage

You can now use these custom modules in Sentence Transformers directly via `trust_remote_code=True`:
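A minimal sketch: the model ID reuses the example repository mentioned above, and `my_module_kwarg` is a hypothetical keyword argument shown only to illustrate the new encode kwargs passthrough.

```python
from sentence_transformers import SentenceTransformer

# trust_remote_code=True is required so that the custom module code
# (e.g. custom_st.py) may be downloaded and executed.
model = SentenceTransformer(
    "tomaarsen/jina-clip-implementation-st",
    trust_remote_code=True,
)

sentences = ["A photo of a cat", "A photo of a dog"]
embeddings = model.encode(sentences)
print(embeddings.shape)

# With this PR, extra keyword arguments to encode() are forwarded to the
# modules' forward(); "my_module_kwarg" is a hypothetical example.
embeddings = model.encode(sentences, my_module_kwarg=42)
```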
I believe this opens up a lot of opportunities for finetuning custom architectures, too. Even modalities not previously used with Sentence Transformers (e.g. audio?) should be totally feasible.
cc @bwanglzu